CLAIMS




What is claimed is: 

1. In a computer system, a method for clustering a plurality of datapoints, wherein
each datapoint is a series of gene expression values, wherein the method
comprises:

a) receiving the gene expression values of the datapoints;

b) using a self organizing map, clustering the datapoints such that the
datapoints that exhibit similar patterns are clustered together into
respective clusters; and

c) providing an output indicating the clusters of the datapoints.

2. The method of Claim 1, wherein the gene expression values are obtained from a
gene that is subjected to at least one condition. 

3. The method of Claim 2, wherein the step of receiving includes receiving gene
expression values of datasets, wherein a dataset is a series of gene expression
values across multiple genes for a condition.

4. The method of Claim 3, further comprising filtering out any datapoints that 
exhibit an insignificant change in the gene expression value, such that working 
datapoints remain. 
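
The filtering step of Claim 4 can be sketched as follows. The claim does not say what counts as an insignificant change, so the criterion used here (the range of a datapoint's values, maximum minus minimum, falling below a caller-supplied `min_range`) and the function name `filter_datapoints` are illustrative assumptions, not the claimed test.

```python
import numpy as np

def filter_datapoints(values, min_range):
    """Keep rows (datapoints) whose expression range is at least min_range.

    values: array of shape (n_datapoints, n_conditions).
    min_range: hypothetical threshold; the claims do not specify one.
    Returns the working datapoints and the boolean mask that selected them.
    """
    values = np.asarray(values, dtype=float)
    change = values.max(axis=1) - values.min(axis=1)
    keep = change >= min_range
    return values[keep], keep

expr = np.array([[1.0, 1.1, 0.9],   # nearly flat: filtered out
                 [1.0, 5.0, 9.0],   # large change: kept
                 [4.0, 4.0, 4.0]])  # constant: filtered out
working, mask = filter_datapoints(expr, min_range=2.0)
```

Returning the mask alongside the working datapoints lets later steps map cluster assignments back to the original datapoints.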



5. The method of Claim 4, further comprising normalizing the gene expression
value of the working datapoints.

6. The method of Claim 5, wherein the self organizing map is formed of a plurality
of Nodes, N, and clusters the datapoints according to a competitive learning
routine.

7. The method of Claim 6, wherein the competitive learning routine is:

f_{i+1}(N) = f_i(N) + \tau(d(N, N_P), i) (P - f_i(N))

wherein i = number of iterations, N = the node of the self organizing map, \tau =
learning rate, P = the subject working datapoint, d = distance, N_P = node that is
mapped nearest to P, and f_i(N) is the position of N at i.
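
One iteration of the competitive learning routine of Claim 7 might look like the sketch below. The claim leaves the distance d and the learning rate open. This sketch assumes nodes laid out on a 2-D grid, d taken as Euclidean distance between grid coordinates, and a hypothetical learning rate that equals a linearly decaying base rate inside a fixed radius and zero outside it. The names `som_step`, `tau`, `radius`, `rate0`, and `n_iter` are illustrative.

```python
import numpy as np

def tau(dist, i, n_iter, radius=1.0, rate0=0.1):
    """Hypothetical learning rate: decays linearly with iteration i and is
    zero for nodes farther than `radius` from the winning node."""
    rate = rate0 * (1.0 - i / n_iter)
    return np.where(dist <= radius, rate, 0.0)

def som_step(positions, grid, P, i, n_iter):
    """Apply f_{i+1}(N) = f_i(N) + tau(d(N, N_P), i) * (P - f_i(N)) to all nodes.

    positions: (n_nodes, n_dims) current node positions f_i(N).
    grid: (n_nodes, 2) fixed grid coordinates of the nodes (an assumption).
    P: (n_dims,) the subject working datapoint.
    """
    # N_P is the node whose current position is nearest to P.
    winner = np.argmin(np.linalg.norm(positions - P, axis=1))
    # d(N, N_P): distance between each node and the winner on the grid.
    d = np.linalg.norm(grid - grid[winner], axis=1)
    return positions + tau(d, i, n_iter)[:, None] * (P - positions)

grid = np.array([[0, 0], [0, 1], [0, 5]], dtype=float)
pos = np.array([[0.0, 0.0], [1.0, 1.0], [9.0, 9.0]])
new_pos = som_step(pos, grid, P=np.array([0.0, 0.5]), i=0, n_iter=10)
```

In this example the first two nodes move one tenth of the way toward P, while the third node lies outside the radius and stays where it is.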

8. The method of Claim 1, wherein the step of providing includes displaying at
least one representative datapoint from each cluster.
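
The claim does not define how a representative datapoint is chosen. As one possible reading, the sketch below picks, for each cluster, the member closest to the cluster mean. The selection rule and the name `representative_datapoints` are assumptions made for illustration.

```python
import numpy as np

def representative_datapoints(values, labels):
    """Map each cluster label to the index of the member nearest the cluster mean."""
    values = np.asarray(values, dtype=float)
    labels = np.asarray(labels)
    reps = {}
    for label in np.unique(labels):
        members = np.flatnonzero(labels == label)
        centroid = values[members].mean(axis=0)
        nearest = np.argmin(np.linalg.norm(values[members] - centroid, axis=1))
        reps[int(label)] = int(members[nearest])
    return reps

values = [[0.0, 0.0], [0.0, 3.0], [0.0, 1.0], [5.0, 5.0]]
labels = [0, 0, 0, 1]
reps = representative_datapoints(values, labels)
```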

9. The method of Claim 5, wherein the step of normalizing the gene expression 
value comprises determining the ratio of a) difference between the subject gene 
expression value and the average gene expression value across datasets, and b) 
the standard deviation of the gene expression value across datasets. 
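
The ratio recited in Claim 9 is a standard score. A minimal sketch, assuming each row is one working datapoint and that the average and standard deviation are taken over that row (one value per dataset). Using the population standard deviation (`ddof=0`) is an assumption, since the claim does not say which estimator is meant.

```python
import numpy as np

def normalize_datapoints(values):
    """Return (value - mean) / std for each row, computed across datasets.

    Assumes every row has nonzero variance, which holds if flat datapoints
    were filtered out beforehand.
    """
    values = np.asarray(values, dtype=float)
    mean = values.mean(axis=1, keepdims=True)
    std = values.std(axis=1, keepdims=True)  # ddof=0 is an assumption
    return (values - mean) / std

normalized = normalize_datapoints([[1.0, 5.0, 9.0], [10.0, 20.0, 30.0]])
```

After normalization the two rows coincide: they have the same shape of change across datasets even though their magnitudes differ, which is what lets the clustering step group datapoints by pattern.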

10. The method of Claim 3, further comprising rescaling the gene expression values
to account for variations across multiple conditions. 

11. In a computer system, a method for grouping a plurality of datapoints, wherein
each datapoint is a series of gene expression values, wherein the method
comprises:

a) receiving gene expression values of the datapoints;

b) filtering out any datapoints that exhibit an insignificant change in the
gene expression value, such that working datapoints remain;

c) normalizing the gene expression value of the working datapoints;

d) using a self organizing map, grouping the working datapoints such that
the datapoints that exhibit similar patterns are grouped together into
respective clusters; and

e) providing an output indicating the groups of the datapoints. 
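
Steps a) through e) can be chained as in the sketch below. Everything the claim leaves open is an assumption here: the range-based filter threshold `min_range`, the grid shape, the iteration count, the shrinking neighborhood, the linearly decaying learning rate, and the function name `cluster_expression`.

```python
import numpy as np

def cluster_expression(values, min_range, grid_shape=(2, 1), n_iter=200, seed=0):
    """Filter, normalize, and SOM-group datapoints; return (labels, keep mask).

    min_range, grid_shape, n_iter, and the learning-rate schedule are
    illustrative assumptions; the claims do not fix any of them.
    """
    values = np.asarray(values, dtype=float)  # a) received expression values
    rng = np.random.default_rng(seed)

    # b) filter out datapoints whose range of values is below min_range
    keep = (values.max(axis=1) - values.min(axis=1)) >= min_range
    working = values[keep]

    # c) normalize each working datapoint to mean 0 and standard deviation 1
    mean = working.mean(axis=1, keepdims=True)
    working = (working - mean) / working.std(axis=1, keepdims=True)

    # d) self organizing map: nodes on a grid, initialized at random datapoints
    rows, cols = grid_shape
    grid = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    nodes = working[rng.choice(len(working), size=len(grid), replace=False)].copy()
    for i in range(n_iter):
        P = working[rng.integers(len(working))]
        winner = np.argmin(np.linalg.norm(nodes - P, axis=1))
        d = np.linalg.norm(grid - grid[winner], axis=1)
        radius = 1.0 - i / n_iter  # neighborhood shrinks over time
        rate = np.where(d <= radius, 0.1 * (1.0 - i / n_iter), 0.0)
        nodes += rate[:, None] * (P - nodes)

    # e) output: the index of the nearest node for each working datapoint
    dists = np.linalg.norm(working[:, None, :] - nodes[None, :, :], axis=2)
    return dists.argmin(axis=1), keep

expr = [[1, 2, 3], [10, 20, 30],   # rising patterns
        [3, 2, 1], [30, 20, 10],   # falling patterns
        [5, 5, 5]]                 # flat: removed by the filter
labels, keep = cluster_expression(expr, min_range=1.0)
```

With this input the two rising datapoints end up in one group and the two falling datapoints in the other. Which group receives which index depends on the random initialization.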

12. The method of Claim 11, wherein the gene expression values are obtained from a
gene that is subjected to at least one condition. 



13. The method of Claim 12, wherein the step of receiving includes receiving gene
expression values of datasets, wherein a dataset is a series of gene expression
values across multiple genes for a condition.

14. The method of Claim 13, wherein the self organizing map is formed of a
plurality of Nodes, N, and groups the datapoints according to a competitive
learning routine.

15. The method of Claim 14, wherein the competitive learning routine is:

f_{i+1}(N) = f_i(N) + \tau(d(N, N_P), i) (P - f_i(N))

wherein i = number of iterations, N = the node of the self organizing map, \tau =
learning rate, P = the subject working datapoint, d = distance, N_P = node that is
mapped nearest to P, and f_i(N) is the position of N at i.

16. The method of Claim 11, wherein the step of providing includes displaying at 
least one representative datapoint from each group. 



17. The method of Claim 13, wherein the step of normalizing the gene expression
value comprises determining the ratio of a) difference between the subject gene
expression value and the average gene expression value across datasets, and b)
the standard deviation of the gene expression value across datasets.

18. The method of Claim 11, further comprising rescaling the gene expression values
to account for variations across multiple conditions. 

19. A computer apparatus for clustering a plurality of datapoints, wherein each
datapoint is a series of gene expression values, wherein the apparatus comprises:

a) a source of gene expression values of the datapoints;

b) a processor routine coupled to receive datapoints from the source, the
processor routine utilizing a self organizing map for clustering datapoints
such that the datapoints that exhibit similar patterns are clustered together
into respective clusters; and

c) an output device, coupled to the processor routine, for indicating the
clusters of the datapoints.



20. The apparatus of Claim 19, wherein the gene expression values are obtained from
a gene that is subjected to at least one condition.

21. The apparatus of Claim 20, wherein the source further provides datasets, each
dataset being a series of gene expression values across multiple genes for a
condition.



22. The computer apparatus of Claim 21, further comprising a filter, coupled to the
source, for filtering out any of the datapoints that exhibit an insignificant change
in the gene expression value, such that working datapoints remain.

23. The computer apparatus of Claim 22, further comprising a normalizing processor
coupled to the filter, for normalizing the gene expression value of the working
datapoints.

24. The computer apparatus of Claim 23, wherein the normalizing process
determines a normalized gene expression value according to the ratio of a)
difference between the subject gene expression value and the average gene
expression value across datasets, and b) the standard deviation of the gene
expression value across datasets.

25. The computer apparatus of Claim 24, wherein the self organizing map is formed
of a plurality of Nodes, N, and clusters the datapoints according to a competitive
learning routine.

26. The computer apparatus of Claim 25, wherein the competitive learning routine is:

f_{i+1}(N) = f_i(N) + \tau(d(N, N_P), i) (P - f_i(N))

wherein i = number of iterations, N = the node of the self organizing map, \tau =
learning rate, P = the subject working datapoint, d = distance, N_P = node that is
mapped nearest to P, and f_i(N) is the position of N at i.

27. The computer apparatus of Claim 26, wherein the output device comprises a 
display of at least one representative datapoint from each cluster. 

28. A computer apparatus for grouping a plurality of datapoints, wherein each
datapoint is a series of gene expression values, wherein the apparatus comprises:

a) a source of gene expression values of the datapoints;

b) a filter, coupled to the source, for receiving the gene expression values
and filtering out any of the datapoints that exhibit an insignificant change
in the gene expression value, such that working datapoints remain;

c) a normalizing process, coupled to the filter, for normalizing the gene
expression value of the working datapoints;

d) a processor routine that is responsive to the normalizing process and
utilizes a self organizing map for grouping the working datapoints such
that the datapoints that exhibit similar patterns are grouped together into
respective groups; and

e) an output device, coupled to the processor routine, for indicating the
groups of the datapoints.

29. The apparatus of Claim 28, wherein the gene expression values are obtained from
a gene that is subjected to at least one condition.




30. The apparatus of Claim 29, wherein the source further provides datasets, each
dataset being a series of gene expression values across multiple genes for a
condition.




31. The computer apparatus of Claim 22, wherein the normalizing process of the
gene expression value is determined according to the ratio of a) difference
between the subject gene expression value and the average gene expression value
across datasets, and b) the standard deviation of the gene expression value across
datasets.



32. The computer apparatus of Claim 31, wherein the self organizing map is formed
of a plurality of Nodes, N, and groups the datapoints according to a competitive
learning routine.

33. The computer apparatus of Claim 32, wherein the competitive learning routine is:

f_{i+1}(N) = f_i(N) + \tau(d(N, N_P), i) (P - f_i(N))

wherein i = number of iterations, N = the node of the self organizing map, \tau =
learning rate, P = the subject working datapoint, d = distance, N_P = node that is
mapped nearest to P, and f_i(N) is the position of N at i.



34. The computer apparatus of Claim 33, wherein the output device comprises a 
display of at least one representative datapoint from each group. 

35. A method for assessing expression patterns of two or more genes in cells,
wherein the expression patterns are represented by a plurality of datapoints,
wherein each datapoint is a series of gene expression values, wherein the method
comprises:

a) receiving the gene expression values of the datapoints;

b) using a self organizing map, clustering the datapoints such that the
datapoints that exhibit similar patterns are clustered together into
respective clusters;

c) providing an output indicating the clusters of the datapoints; and

d) analyzing the output to determine the similarities or differences between
the expression patterns of the genes.
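
The claim does not spell out how the output is analyzed. One simple reading, assumed here purely for illustration, treats two genes as having similar expression patterns when their datapoints fall in the same cluster. The function name `related_gene_pairs` and the gene names are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

def related_gene_pairs(gene_names, labels):
    """Return sorted pairs of genes that were assigned to the same cluster."""
    clusters = defaultdict(list)
    for name, label in zip(gene_names, labels):
        clusters[label].append(name)
    pairs = []
    for members in clusters.values():
        pairs.extend(combinations(sorted(members), 2))
    return sorted(pairs)

pairs = related_gene_pairs(["geneA", "geneB", "geneC", "geneD"], [0, 1, 0, 0])
```

Here geneA, geneC, and geneD share a cluster and so form three pairs, while geneB sits alone in its cluster and appears in none.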



36. The method of Claim 35, wherein the gene expression values are obtained from a
gene that is subjected to at least one condition.



37. The method of Claim 36, wherein a dataset is a series of gene expression values 
across multiple genes for a condition. 



38. The method of Claim 37, further comprising filtering out any datapoints that
exhibit an insignificant change in the gene expression value, such that working
datapoints remain.

39. The method of Claim 38, further comprising normalizing the gene expression
value of the working datapoints.

40. The method of Claim 39, wherein the self organizing map is formed of a
plurality of Nodes, N, and clusters the datapoints according to a competitive
learning routine.

41. The method of Claim [illegible], wherein the competitive learning routine is:

f_{i+1}(N) = f_i(N) + \tau(d(N, N_P), i) (P - f_i(N))

wherein i = number of iterations, N = the node of the self organizing map, \tau =
learning rate, P = the subject working datapoint, d = distance, N_P = node that is
mapped nearest to P, and f_i(N) is the position of N at i.

42. The method of Claim 39, wherein the step of normalizing the gene expression
value comprises determining the ratio of a) difference between the subject gene
expression value and the average gene expression value across the datasets, and
b) the standard deviation of the gene expression value across datasets.



43. The method of Claim 35, further comprising rescaling the gene expression values 
to account for variations across multiple conditions. 



44. A method for characterizing expression patterns of a plurality of genes of a
sample having unknown characteristics, wherein the sample from an individual is
obtained and subjected to a multiplicity of diagnostic tests, and the expression
patterns of the genes for the diagnostic tests are represented by a plurality of
datapoints, wherein the datapoint is a series of gene expression values across
multiple genes for the diagnostic test, wherein the method comprises:

a) receiving the gene expression values of the datapoints from the diagnostic
tests;

b) using a self organizing map, clustering the datapoints such that the
datapoints that exhibit similar patterns are clustered together into
respective clusters;

c) providing an output indicating the clusters of the datapoints; and

d) comparing the output of the gene expression patterns of the unknown
sample against a control,

thereby characterising gene expression patterns of the sample.



45. The method of Claim [illegible], wherein the gene expression values across
multiple genes for the diagnostic test are obtained from a gene subjected to at
least one condition.



46. The method of Claim 45, wherein a dataset is a series of gene expression values 
from a gene subjected to the diagnostic tests.






47. The method of Claim 46, wherein the sample from the individual is selected 
from the group consisting of: cells, lysed cells, cellular material suitable for 
determining gene expression, and material containing gene expression products. 



48. The method of Claim 47, further comprising normalizing the gene expression
value of the datapoints.

49. The method of Claim 48, wherein the self organizing map is formed of a
plurality of Nodes, N, and clusters the datapoints according to a competitive
learning routine.

50. The method of Claim 49, wherein the competitive learning routine is:

f_{i+1}(N) = f_i(N) + \tau(d(N, N_P), i) (P - f_i(N))

wherein i = number of iterations, N = the node of the self organizing map, \tau =
learning rate, P = the subject working datapoint, d = distance, N_P = node that is
mapped nearest to P, and f_i(N) is the position of N at i.



51. The method of Claim 50, wherein the step of normalizing the gene expression
value comprises determining the ratio of a) difference between the subject gene
expression value and the average gene expression value across datasets, and b)
the standard deviation of the gene expression value across datasets.

52. A method of determining relatedness of expression patterns of two or more
genes, wherein the expression patterns are represented by a plurality of
datapoints, wherein each datapoint is a series of gene expression values, wherein
the method comprises:

a) receiving the gene expression values of the datapoints; 

b) using a self organizing map, clustering the datapoints such that the
datapoints that exhibit similar patterns are clustered together into
respective clusters;

c) providing an output indicating the clusters of the datapoints; and

d) analyzing the output to determine the similarities and/or differences
between the expression patterns of the genes,

thereby determining the relatedness of two or more genes.

53. The method of Claim 52, wherein the gene expression values are obtained from a
gene that is subjected to at least one condition.

54. The method of Claim 53, wherein a dataset is a series of gene expression values 
across multiple genes for a condition.

55. The method of Claim 54, further comprising filtering out any datapoints that
exhibit an insignificant change in the gene expression value, such that working
datapoints remain.

56. The method of Claim 55, further comprising normalizing the gene expression
value of the working datapoints.



57. The method of Claim 56, wherein the self organizing map clusters the datapoints
according to:

f_{i+1}(N) = f_i(N) + \tau(d(N, N_P), i) (P - f_i(N))

wherein i = number of iterations, N = the node of the self organizing map, \tau =
learning rate, P = the subject working datapoint, d = distance, N_P = node that is
mapped nearest to P, and f_i(N) is the position of N at i.



58. A method of identifying a drug target from the expression patterns of two or
more genes from cells, wherein the expression patterns are represented by a
plurality of datapoints, and wherein each datapoint is a series of gene
expression values, wherein the method comprises:

a) obtaining cells that express genes,

b) subjecting the cells to an agent or condition for testing the drug target,

c) measuring gene expression from the cells subjected to the agent or
condition, and from a control, to obtain the gene expression values,

d) receiving the gene expression values of the datapoints;

e) using a self organizing map, clustering the datapoints such that the
datapoints that exhibit similar patterns are clustered together into
respective clusters;

f) comparing the clusters from the genes that have been subjected to the
agents or condition with a control; and

g) providing an output indicating clusters, to thereby determine the drug
target.

59. The method of Claim 58, further comprising filtering out any datapoints that
exhibit an insignificant change in the gene expression value, such that working
datapoints remain.

60. The method of Claim 59, further comprising normalizing the gene expression
value of the working datapoints.

61. The method of Claim 60, wherein the self organizing map clusters the datapoints
according to:

f_{i+1}(N) = f_i(N) + \tau(d(N, N_P), i) (P - f_i(N))

wherein i = number of iterations, N = the node of the self organizing map, \tau =
learning rate, P = the subject working datapoint, d = distance, N_P = node that is
mapped nearest to P, and f_i(N) is the position of N at i.



